Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher.
Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?
Some links on this page may take you to non-federal websites. Their policies may differ from this site.
-
Free, publicly-accessible full text available June 16, 2026
-
Free, publicly-accessible full text available March 31, 2026
-
Free, publicly-accessible full text available August 4, 2026
-
Free, publicly-accessible full text available January 20, 2026
-
The rising availability of heterogeneous networked devices highlights new opportunities for distributed artificial intelligence. This work proposes an Integer Linear Programming (ILP) optimization scheme to assign layers of a neural network in a distributed setting with heterogeneous devices representing edge, hub, and cloud in order to minimize the overall inference latency. The ILP formulation captures the tradeoff between avoiding communication cost when executing consecutive layers on the same device versus the latency benefit due to weight preloading when an idle device is waiting to receive the results of an earlier layer across the network. In our experiments we show the layer assignment and inference latency of a neural network can significantly vary depending on the types of devices in the network and their communications bandwidths.more » « less
An official website of the United States government
